Combining the flexibility of speech synt pre-recorded audio: a comparison of two ap

Author

Wael Hamza

Abstract

Many applications of TTS incorporate both unpredictable words, which require the flexibility of TTS, and static phrases, for which the quality of recorded speech is unmatched by TTS. “Phrase-splicing” TTS attempts to provide the optimal combination of the two: it customizes concatenative TTS to such applications by incorporating application-specific recordings at the word or phrase level, while resorting to smaller-unit synthesis to fill the gaps not covered by those recordings. In the past, we have achieved this by using a word-level search on the application-specific recordings followed by a general-purpose TTS search, in our case using sub-phonetic units, to fill the gaps. However, recent trends toward larger-unit roles in general-purpose TTS suggest a single-search approach for phrase splicing. A listening test shows that the new one-search algorithm achieves quality at least as high as the two-search algorithm.
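The two-search idea described in the abstract can be illustrated with a minimal sketch. This is not the author's algorithm: the abstract does not give its details, so the sketch substitutes a plain greedy longest-match over word sequences for the word-level search, and it only labels the uncovered words for general-purpose synthesis instead of running a sub-phonetic unit search. The `RECORDED` inventory, its file names, and the `splice_plan` function are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical inventory of application-specific recordings,
# keyed by the word sequence each recording covers.
RECORDED: Dict[Tuple[str, ...], str] = {
    ("your", "balance", "is"): "rec_001.wav",
    ("thank", "you"): "rec_002.wav",
}

Segment = Tuple[str, Tuple[str, ...]]


def splice_plan(words: List[str],
                recorded: Dict[Tuple[str, ...], str] = RECORDED) -> List[Segment]:
    """Split a sentence into recorded phrases and gaps left for synthesis.

    Pass 1 (assumed simplification): greedy longest match of word
    sequences against the recorded inventory.
    Pass 2 (stand-in only): consecutive uncovered words are grouped and
    tagged "synth" for a general-purpose TTS search, which is not
    implemented here.
    """
    max_len = max((len(key) for key in recorded), default=0)
    plan: List[Segment] = []
    gap: List[str] = []
    i = 0
    while i < len(words):
        match = None
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = tuple(words[i:i + n])
            if candidate in recorded:
                match = candidate
                break
        if match is not None:
            if gap:
                # Close the pending gap before emitting the recorded phrase.
                plan.append(("synth", tuple(gap)))
                gap = []
            plan.append(("recorded", match))
            i += len(match)
        else:
            gap.append(words[i])
            i += 1
    if gap:
        plan.append(("synth", tuple(gap)))
    return plan
```

For a sentence such as "your balance is forty dollars thank you", the plan marks the carrier phrases as recorded and leaves "forty dollars" as a gap for synthesis. A single-search variant would instead place recorded words and smaller units in one candidate set and let one cost-based search choose among them.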

Similar resources

Exploring EFL Learners’ Use of Formulaic Sequences in Pragmatically Focused Role-play Tasks

Communicative language use largely entails regular patterns consisting of pre-constructed phrases or sequences. These sequences have been examined by many researchers to find the situation-based formulas which may help L2 learners follow a possibly more target-like speaking system. This study, therefore, explored two categories of formulaic expressions including speech formulas and situation-bo...

Combining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)

Commercial quadcopters with many private, commercial, and public sector applications are a rapidly advancing technology. Currently, there is no guarantee to facilitate the safe operation of these devices in the community. Three different automatic commercial quadcopters identification methods are presented in this paper. Among these three techniques, two are based on deep neural networks in whi...

Comparing the Impact of Audio-Visual Input Enhancement on Collocation Learning in Traditional and Mobile Learning Contexts

This study investigated the impact of audio-visual input enhancement teaching techniques on improving English as Foreign Language (EFL) learners' collocation learning as well as their accuracy concerning collocation use in narrative writing. In addition, it compared the impact and efficiency of audio-visual input enhancement in two learning contexts, namely traditional and mo...

Speech synthesis by structured segments, using temporal decomposition and a glottal excitation

Classical speech synthesis systems either concatenate diphone-like tabulated patterns or reconstruct speech parameters according to pre-defined rules. Both techniques show drawbacks: the former lacks flexibility while the latter is highly time-consuming to build. We propose an intermediate technique using structured segments: segmental units are still resorted to, but they are automatically analy...

The Efficacy of Audio Input Flooding Tasks on Learning Grammar: Uptake of Present Tense

This study sought to probe the role of input flooding through listening tasks on the uptake of simple present tense and the present progressive tense among pre-intermediate English as Foreign Language (EFL) learners. To comply with the objective, an experimental design was adopted. 55 pre-intermediate learners participated in the study. They were randomly divided into one control group, ...

Publication date: 2005